Another Factorial File Compression Experiment Using SAS
Abstract
Continuing experimental work on SAS data set compression presented at NESUG in 2004, I designed another two-factor factorial experiment. The first factor compares seven treatments: the three kinds of data set compression offered by SAS on UNIX (the SAS data set options COMPRESS=CHAR and COMPRESS=BINARY, and SAS sequential format files created with the V9TAPE engine); three UNIX file compression algorithms (compress, gzip, and bzip2); and a control without any file compression. The second factor compares four kinds of SAS data sets: all character variables, all numeric variables, half character and half numeric, and all numeric in which LENGTHs shorter than 8 were used for smaller values. The bzip2 algorithm minimized compressed file size for all four control SAS data sets. Only the three SAS file compression methods can be used to give other SAS users read access to compressed SAS data sets without giving them write permission to these files. SAS COMPRESS=BINARY reduced compressed file size more than SAS COMPRESS=CHAR on all four variable type treatments tested, including SAS data sets containing only character variables.
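The flavor of the comparison between variable-type treatments and file compression algorithms can be sketched outside SAS. The following Python example is only an illustration under stated assumptions, not the experiment described above: it builds synthetic fixed-width records (not real SAS data sets), covers three of the four variable-type treatments, and uses Python's standard `gzip` and `bz2` modules in place of the UNIX command-line tools. The names `make_records`, `compressed_sizes`, and `WORDS` are hypothetical helpers introduced for this sketch.

```python
import bz2
import gzip
import random
import struct

# Hypothetical vocabulary used to fill the synthetic character fields.
WORDS = ["alpha", "beta", "gamma", "delta"]


def make_records(kind: str, n_rows: int = 2000, n_vars: int = 8, seed: int = 0) -> bytes:
    """Build synthetic fixed-width records of 8-byte fields.

    kind is "char" (all text fields), "numeric" (all 8-byte doubles),
    or "mixed" (alternating text and numeric fields).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        fields = []
        for j in range(n_vars):
            is_char = kind == "char" or (kind == "mixed" and j % 2 == 0)
            if is_char:
                # Blank-padded 8-byte text field.
                fields.append(rng.choice(WORDS).ljust(8).encode("ascii"))
            else:
                # 8-byte IEEE double holding a small integer value.
                fields.append(struct.pack("<d", float(rng.randint(0, 999))))
        rows.append(b"".join(fields))
    return b"".join(rows)


def compressed_sizes(data: bytes) -> dict:
    """Return the byte count for the uncompressed control and each algorithm."""
    return {
        "none": len(data),
        "gzip": len(gzip.compress(data)),
        "bzip2": len(bz2.compress(data)),
    }


if __name__ == "__main__":
    # One row of output per variable-type treatment.
    for kind in ("char", "numeric", "mixed"):
        print(kind, compressed_sizes(make_records(kind)))
```

Results on synthetic records like these say nothing about the sizes reported in the paper; the sketch only shows how a treatment-by-algorithm table of file sizes can be assembled.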
Similar resources
A Measurement Study on the BitTorrent File Distribution System
The file distribution protocol BitTorrent (BT) is very popular nowadays, and people share files with BT clients every day. In this project, a two-factor full factorial design on workloads and BT clients was proposed to test which effect is significant for BT downloading. The first part presents the background knowledge of peer-to-peer (P2P) and BT and the introduction of several popu...
XML for Electronic Submission: A Possible Better Alternative to SAS Transport Files
The eXtensible Markup Language (XML) is the next generation of ASCII text file. It was developed to describe, transfer, deliver and share data across different platforms and applications. Its hierarchical data structure allows XML to establish relationships among elements and other inside/outside sources. As the use of XML gains momentum, it may be accepted by the FDA as a data format for elect...
An Algorithm for Screening of Genes and Clusters from Microarray Experiments
This paper presents an implementation, using the SAS System, of the “cluster scoring” method proposed by Tibshirani, Hastie, Balasubramanian, Eisen, Sherlock, Brown and Botstein (2002) for use in microarray experiments. The program is designed in a modular fashion, using SAS Macro Language, so that implementations for specific experimental cases can be added with ease. Current development accom...
127-31: Using the SAS® System for Experimental Designs for Multi-Component Interventions in Medicine
We demonstrate how to use SAS to design experiments for multicomponent interventions for multifactorial health syndromes. Multifactorial syndromes are health conditions that have more than one risk factor related to the outcome and require interventions with several components that target different risk factors. The design and analysis of multicomponent trials are complicated by the number of f...
No More Downloading - Using SAS/ODS to Create Graphs and HTML Documents for OS/390 Systems
With the advent of mainframe web server software, such as IBM's 'OS/390 HTTP Server', and with SAS Release 8's new ODS facility, it is now possible to create HTML documents in OS/390 data sets PDS/E, sequential or HFS. Graphs and web pages stored in these OS/390 data sets can be viewed via a web browser. It is no longer necessary to download the documents to another type of platform for viewing...